Topmoumoute Online Natural Gradient Algorithm
Authors
Abstract
Natural gradient is a gradient descent technique which uses the inverse of the covariance matrix of the gradient. Using the central limit theorem, we prove that it yields the direction that minimizes the probability of overfitting. However, its prohibitive computational cost makes it impractical for online training. Here, we present a new online version of the natural gradient which we coin TONGA (Topmoumoute Online Natural Gradient Algorithm).
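The core update described in the abstract, preconditioning the mean gradient by the inverse of the gradient covariance, can be sketched as follows. This is a minimal batch illustration, not TONGA itself (which maintains a cheap low-rank online estimate of the covariance); the function name, learning rate, and damping term are illustrative assumptions.

```python
import numpy as np

def natural_gradient_step(per_example_grads, lr=0.1, damping=1e-4):
    """Compute a damped natural-gradient step -lr * C^{-1} g.

    per_example_grads: (n_examples, n_params) array of individual gradients.
    g is the mean gradient; C is the empirical covariance of the
    per-example gradients, damped for numerical stability.
    """
    g = per_example_grads.mean(axis=0)                   # mean gradient
    centered = per_example_grads - g
    C = centered.T @ centered / len(per_example_grads)   # gradient covariance
    C += damping * np.eye(C.shape[0])                    # regularize inversion
    return -lr * np.linalg.solve(C, g)                   # -lr * C^{-1} g
```

Because C is positive definite after damping, the returned step always has a negative inner product with the mean gradient, i.e. it remains a descent direction; the point of the preconditioning is to shrink movement along directions where the gradient estimate is noisy.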
Similar references
A Stochastic Quasi-Newton Method for Online Convex Optimization
We develop stochastic variants of the well-known BFGS quasi-Newton optimization method, in both full and memory-limited (LBFGS) forms, for online optimization of convex functions. The resulting algorithm performs comparably to a well-tuned natural gradient descent but is scalable to very high-dimensional problems. On standard benchmarks in natural language processing, it asymptotically outperfor...
Matrix momentum for practical natural gradient learning
An on-line learning rule, based on the introduction of a matrix momentum term, is presented, aimed at alleviating the computational costs of standard natural gradient learning. The new rule, natural gradient matrix momentum, is analysed in the case of two-layer feed-forward neural network learning via methods of statistical physics. It appears to provide a practical algorithm that performs as we...
On "Natural" Learning and Pruning in Multilayered Perceptrons
Several studies have shown that natural gradient descent for on-line learning is much more efficient than standard gradient descent. In this paper, we derive natural gradients in a slightly different manner and discuss implications for batch-mode learning and pruning, linking them to existing algorithms such as Levenberg-Marquardt optimization and optimal brain surgeon. The Fisher matrix plays an ...
Pii: S0165-1684(01)00146-3
In this paper, we study convergence and efficiency of the batch estimator and natural gradient algorithm for blind deconvolution. First, the blind deconvolution problem is formulated in the framework of a semiparametric model, and a family of estimating functions is derived for blind deconvolution. To improve the learning efficiency of the online algorithm, explicit standardized estimating function...
Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient
The parameter space of neural networks has a Riemannian metric structure. The natural Riemannian gradient should be used instead of the conventional gradient, since the former denotes the true steepest descent direction of a loss function in the Riemannian space. The behavior of the stochastic gradient learning algorithm is much more effective if the natural gradient is used. The present paper ...
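In the Riemannian formulation sketched above, the steepest-descent direction is the Euclidean gradient corrected by the inverse metric. The standard form of this update (Amari's formulation, stated here for context rather than taken from this abstract) is:

```latex
\theta_{t+1} = \theta_t - \eta \, G(\theta_t)^{-1} \nabla_\theta L(\theta_t),
\qquad
G(\theta) = \mathbb{E}_{x \sim p_\theta}\!\left[\nabla_\theta \log p_\theta(x)\,\nabla_\theta \log p_\theta(x)^{\top}\right],
```

where \(G(\theta)\) is the Fisher information matrix serving as the Riemannian metric; when \(G\) is the identity, the update reduces to ordinary gradient descent.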
Publication year: 2007